probabilistic distance
Attention Head Embeddings with Trainable Deep Kernels for Hallucination Detection in LLMs
Oblovatny, Rodion, Bazarova, Alexandra, Zaytsev, Alexey
--We present a novel approach for detecting hallucinations in large language models (LLMs) by analyzing the probabilistic divergence between prompt and response hidden-state distributions. Counterintuitively, we find that hallucinated responses exhibit smaller deviations from their prompts compared to grounded responses, suggesting that hallucinations often arise from superficial rephrasing rather than substantive reasoning. T o enhance sensitivity, we employ deep learn-able kernels that automatically adapt to capture nuanced geometric differences between distributions. Our approach outperforms existing baselines, demonstrating state-of-the-art performance on several benchmarks. The method remains competitive even without kernel training, offering a robust, scalable solution for hallucination detection. In recent years, large language models (LLMs) have been widely adopted in many applications. However, they often generate hallucinations -- incorrect or fabricated content that does not match real-world facts or the provided context [1]. The latter case is of special interest, as it refers to incorrect generations in retrieval-augmented generation (RAG) settings, where LLMs rely on retrieved information to answer user queries.
Uncertainty-aware t-distributed Stochastic Neighbor Embedding for Single-cell RNA-seq Data
Nonlinear data visualization using t-distributed stochastic neighbor embedding (t-SNE) enables the representation of complex single-cell transcriptomic landscapes in two or three dimensions to depict biological populations accurately. However, t-SNE often fails to account for uncertainties in the original dataset, leading to misleading visualizations where cell subsets with noise appear indistinguishable. To address these challenges, we introduce uncertainty-aware t-SNE (Ut-SNE), a noise-defending visualization tool tailored for uncertain single-cell RNA-seq data. By creating a probabilistic representation for each sample, Our Ut-SNE accurately incorporates noise about transcriptomic variability into the visual interpretation of single-cell RNA sequencing data, revealing significant uncertainties in transcriptomic variability. Through various examples, we showcase the practical value of Ut-SNE and underscore the significance of incorporating uncertainty awareness into data visualization practices. This versatile uncertainty-aware visualization tool can be easily adapted to other scientific domains beyond single-cell RNA sequencing, making them valuable resources for high-dimensional data analysis.
No Need to Sacrifice Data Quality for Quantity: Crowd-Informed Machine Annotation for Cost-Effective Understanding of Visual Data
Klugmann, Christopher, Mahmood, Rafid, Hegde, Guruprasad, Kale, Amit, Kondermann, Daniel
Labeling visual data is expensive and time-consuming. Crowdsourcing systems promise to enable highly parallelizable annotations through the participation of monetarily or otherwise motivated workers, but even this approach has its limits. The solution: replace manual work with machine work. But how reliable are machine annotators? Sacrificing data quality for high throughput cannot be acceptable, especially in safety-critical applications such as autonomous driving. In this paper, we present a framework that enables quality checking of visual data at large scales without sacrificing the reliability of the results. We ask annotators simple questions with discrete answers, which can be highly automated using a convolutional neural network trained to predict crowd responses. Unlike the methods of previous work, which aim to directly predict soft labels to address human uncertainty, we use per-task posterior distributions over soft labels as our training objective, leveraging a Dirichlet prior for analytical accessibility. We demonstrate our approach on two challenging real-world automotive datasets, showing that our model can fully automate a significant portion of tasks, saving costs in the high double-digit percentage range. Our model reliably predicts human uncertainty, allowing for more accurate inspection and filtering of difficult examples. Additionally, we show that the posterior distributions over soft labels predicted by our model can be used as priors in further inference processes, reducing the need for numerous human labelers to approximate true soft labels accurately. This results in further cost reductions and more efficient use of human resources in the annotation process.
Towards Case-Based Preference Elicitation: Similarity Measures on Preference Structures
While decision theory provides an appealing normative framework for representing rich preference structures, eliciting utility or value functions typically incurs a large cost. For many applications involving interactive systems this overhead precludes the use of formal decision-theoretic models of preference. Instead of performing elicitation in a vacuum, it would be useful if we could augment directly elicited preferences with some appropriate default information. In this paper we propose a case-based approach to alleviating the preference elicitation bottleneck. Assuming the existence of a population of users from whom we have elicited complete or incomplete preference structures, we propose eliciting the preferences of a new user interactively and incrementally, using the closest existing preference structures as potential defaults. Since a notion of closeness demands a measure of distance among preference structures, this paper takes the first step of studying various distance measures over fully and partially specified preference structures. We explore the use of Euclidean distance, Spearman's footrule, and define a new measure, the probabilistic distance. We provide computational techniques for all three measures.
Similarity Measures on Preference Structures, Part II: Utility Functions
Ha, Vu A., Haddawy, Peter, Miyamoto, John
In previous work cite{Ha98:Towards} we presented a case-based approach to eliciting and reasoning with preferences. A key issue in this approach is the definition of similarity between user preferences. We introduced the probabilistic distance as a measure of similarity on user preferences, and provided an algorithm to compute the distance between two partially specified {em value} functions. This is for the case of decision making under {em certainty}. In this paper we address the more challenging issue of computing the probabilistic distance in the case of decision making under{em uncertainty}. We provide an algorithm to compute the probabilistic distance between two partially specified {em utility} functions. We demonstrate the use of this algorithm with a medical data set of partially specified patient preferences,where none of the other existing distancemeasures appear definable. Using this data set, we also demonstrate that the case-based approach to preference elicitation isapplicable in domains with uncertainty. Finally, we provide a comprehensive analytical comparison of the probabilistic distance with some existing distance measures on preferences.
The Design of Computer Experiments of Complex Adaptive Social Systems for Risk Based Analysis of Intervention Strategies
Duong, Deborah V. (Agent Based Learning Systems)
Computational social science, as with all complex adaptive systems sciences, involves a great amount of uncertainty on several fronts, including intrinsic arbitrariness such as that due to path dependence, disagreement on social theory and how to capture it in software, input data of different credibility that does not exactly match the requirements of software because it was gathered for another purpose, and inexactly matching translations between models that were designed for different purposes than the study at hand. This paper presents a method of formally tracking that uncertainty, keeping the data input parameters proportionate with logical and probabilistic constraints, and capturing proportionate dynamics of the output ordered by the decision points of policy change, for the purpose of risk-based analysis. Once ordered this way, the data can be compared to other data similarly expressed, whether that data is from simulation excursions or from the real world, for objective comparison and distance scoring at the level of dynamic patterns as opposed to single outcome validation. This method enables wargame adjudicators to be run out with data gleaned from the wargame, enables data to be repurposed for both training and testing set, and facilitates objective validation scoring through soft matching. Artificial intelligence tools used in the method include probabilistic ontologies with crisp and Bayesian inference, game trees that are multiplayer non-zero sum and decision point based rather than turn-based, and Markov processes to represent the dynamic data and align the models for objective comparison.
Extended Grassmann Kernels for Subspace-Based Learning
Subspace-based learning problems involve data whose elements are linear subspaces of a vector space. To handle such data structures, Grassmann kernels have been proposed and used previously. In this paper, we analyze the relationship between Grassmann kernels and probabilistic similarity measures. Firstly, we show that the KL distance in the limit yields the Projection kernel on the Grassmann manifold, whereas the Bhattacharyya kernel becomes trivial in the limit and is suboptimal for subspace-based problems. Secondly, based on our analysis of the KL distance, we propose extensions of the Projection kernel which can be extended to the set of affine as well as scaled subspaces. We demonstrate the advantages of these extended kernels for classification and recognition tasks with Support Vector Machines and Kernel Discriminant Analysis using synthetic and real image databases.